[OpenCL] Refactor cl_program generation #7834
Merged
tqchen
requested changes
Apr 13, 2021
jroesch
reviewed
Apr 13, 2021
tqchen
reviewed
Apr 26, 2021
minor nits, other parts lgtm
tqchen
approved these changes
Apr 26, 2021
Thanks @csullivan. This is merged.
umangyadav
pushed a commit
to umangyadav/tvm
that referenced
this pull request
May 5, 2021
* Refactor OpenCL runtime module to build separate cl_programs for each kernel. This can avoid pathological bugs in the vendor specific OpenCL compiler that may be triggered with large programs.
* clang-format
* Remove check on program size when deconstructing.
* Refactor into SplitKernels method.
* Limit number of loops for kernel parsing
* Add return doc for SplitKernels per CR.
trevor-m
pushed a commit
to trevor-m/tvm
that referenced
this pull request
May 6, 2021
trevor-m
pushed a commit
to neo-ai/tvm
that referenced
this pull request
May 11, 2021
I have encountered a few pathological bugs in the OpenCL compiler provided on the Snapdragon Android platform (e.g. the OpenCL compiler hanging for 5+ hours in a call to clBuildProgram, and non-deterministic emission of `cl_a6x_cmdbuf_mgr_submit_ibs`). I have isolated them into a minimal reproducible example and found that they occur only when all kernels are created from a single cl_program. If a cl_program is instead created for each kernel, these issues are avoided.

This PR proposes adding a kernel primitive delimiter to the OpenCL code generation, and having the OpenCL module runtime use this delimiter to build and cache a separate cl_program for each generated kernel source.
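The delimiter-based splitting described above can be sketched roughly as follows. This is an illustration, not the code merged in this PR: the delimiter string `// Function: `, the `max_kernels` parameter, and the exact signature are assumptions made for the example. It assumes each kernel in the generated source is preceded by a line holding the delimiter followed by the kernel name, and it returns a map from kernel name to that kernel's source. The iteration cap mirrors the "limit number of loops for kernel parsing" commit.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical delimiter; the marker actually emitted by the code generator may differ.
static const std::string kKernelDelimiter = "// Function: ";

// Splits a concatenated OpenCL source string into one source string per kernel.
// Each kernel is expected to start with a line "<delimiter><kernel name>".
// max_kernels (an assumed parameter) bounds the loop so malformed input cannot spin forever.
std::unordered_map<std::string, std::string> SplitKernels(const std::string& source,
                                                          std::size_t max_kernels = 1000) {
  std::unordered_map<std::string, std::string> kernels;
  std::size_t begin = source.find(kKernelDelimiter);
  std::size_t count = 0;
  while (begin != std::string::npos && count < max_kernels) {
    // The kernel name runs from the end of the delimiter to the end of that line.
    std::size_t name_begin = begin + kKernelDelimiter.size();
    std::size_t name_end = source.find('\n', name_begin);
    if (name_end == std::string::npos) break;  // delimiter line without a body
    std::string name = source.substr(name_begin, name_end - name_begin);

    // The kernel body runs until the next delimiter or the end of the source.
    std::size_t body_begin = name_end + 1;
    std::size_t next = source.find(kKernelDelimiter, body_begin);
    std::size_t body_len =
        (next == std::string::npos) ? std::string::npos : next - body_begin;
    kernels[name] = source.substr(body_begin, body_len);

    begin = next;
    ++count;
  }
  return kernels;
}
```

With a map like this, a runtime could call `clCreateProgramWithSource` and `clBuildProgram` once per entry and cache the resulting cl_program under the kernel name, instead of building one large program from the whole source.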